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Abstract: In this paper, a method via sparse-sparse iteration for computing a 
sparse incomplete factorization of the inverse of a symmetric positive definite ma- 
'^ I trix is proposed. The resulting factorized sparse approximate inverse is used as 

J3 ' a preconditioner for solving symmetric positive definite linear systems of equa- 

tions by using the preconditioned conjugate gradient algorithm. Some numerical 
experiments on test matrices from the Harwell-Boeing collection for comparing 
. , the numerical performance of the presented method with one available well-known 

Ti^ij- ■ algorithm are also given. 
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/\' 1. Introduction 

Consider the nonsingular linear system of equations 

Ax = b, (1) 

where the coefficient matrix A G M"^" is large, sparse and x,b E M". It is well- 
known that the rate of convergence of iterative methods such as Krylov subspace 
methods for solving ([1]) is strongly influenced by the spectral properties of A. 
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Hence, iterative methods usually involve a second matrix that transforms the co- 
efficient matrix into one with a more favorable spectrum. The transformation 
matrix is called a preconditioner. If M is a nonsingular matrix that approximates 
the inverse of A [M ^ A~^), then the transformed linear system 

AMy = b, x = My, (2) 

will have the same solution as system ([1]), but the convergence rate of iterative 
methods applied to ([2]) may be higher. System ([2]) is preconditioned from the 
right, but left preconditioning is also possible, i.e., MAx = Mb. One can also 
define split-preconditioned systems. Let us assume that A has the LU factorization 
and 

M = MuMl, where Mu ^ U'^ and Ml ^ L^\ (3) 

where L and U are the lower and upper triangular factors of A. This type of 
preconditioning is known as factorized approximate inverses and Mjj and Ml are 
called approximate inverse factors of A. Here, the transformed linear system can 
be considered as follows 

MuAMLy = Mub, x = MLy. (4) 

System (jlj) is called a split-preconditioned system. 

In this paper we focus our attention on the computation of sparse approximate 
inverse factors of a matrix. There are different ways to compute sparse approximate 
inverse factors of a matrix and each of them has its own advantages and disadvan- 
tages. In [HI [7], the AINV method was proposed which is based on an algorithm 
which computes two sets of vectors {zi}'^^^ and {wj}"^^ which are A-biconjugate, 
i.e., such that wfAzj = if and only ii i ^ j. Although the construction phase 
for the original AINV algorithm is sequential, its application is highly parallel, 
since it consists of matrix-vector products. A fully parallel AINV algorithm can 
be achieved by means of graph partitioning (see [2l |3[ [8]). For symmetric positive 
definite (SPD) matrices, there exists a variant of the AINV method, denoted by 
SAINV (for Stabilized AINV), that is breakdown- free [3]. Another approach which 
was proposed by Kolotilina and Yeremin is the FSAI algorithm fHX W2\ . They 
assume that A is SPD and then construct factorized sparse approximate inverse 
preconditioners which are also SPD. Each factor implicitly approximates the in- 
verse of the lower triangular Cholesky factor of A. This method can be easily 



extended to the nonsyinmetric case. The FSAI algorithm is inherently parallel but 
its main disadvantage is the need to prescribe the sparsity of approximate inverse 
factors in advance. 

In this paper, we first propose an iterative method for solving SPD linear sys- 
tems of equations, and then, by exploiting this method, we develop an algorithm 
for computing an incomplete factorization of the inverse of an SPD matrix. The 
resulting factorized sparse approximate inverse is used as an explicit preconditioner 
for the solution of Ax = bhy the preconditioned conjugate gradient (PCG) method. 

Throughout the paper \\z\\a stands for the A-norm of any vector z, i.e., ||2;||^ = 
(Az, zY^'^. We will denote the largest and smallest eigenvalues of the matrix X by 
>^max{X) and Xmin{X), respectively. 

This paper is organized as follows. In section 2, we introduce an approach for 
computing a sparse approximate solution of an SPD linear system of equations. 
Section 3 is devoted to computing an incomplete factorization of the inverse of an 
SPD matrix. Numerical experiments are given in section 4. Finally, we give some 
concluding remarks in section 5. 



2. Sparse approximate solution of an SPD linear system of equations 

In this section, we first present an approach, based on the projection method, 
for solving an SPD linear system of equations. Then we develop an algorithm for 
computing a sparse approximate solution of an SPD linear system of equations. 

Let A = (ttij) be an SPD matrix and let us consider a projection method with 
C = K = span{ejj, ej2, . . . ,ej„}, where Cj^. is the i^-th column of identity matrix 
and m is a small natural number. Given an initial guess x of the solution of ([1]) 
and the residual vector r = b — Ax, the new approximation takes the form 

for some y G M™", and E = [cj^, Cjj, • • • , Cj^]. The Petrov-Galerkin condition r — 
AEy ± C yields 

y = {E'^AEY^E'^r. (6) 

As known [15] , this kind of update minimizes 



\X + Ey - Xexact\\A 



over all y G M"*, where x exact is the exact solution of Ax = b. It is obvious that the 
matrix S = E'^AE is an SPD matrix of dimension m. Defining J' = {ii,i2, . . . , im}, 
the matrix E'^AE is the principal submatrix of A consisting of the rows and columns 
whose indices are in J'. This new approach for solving an SPD linear system of 
equations can be stated as follows. 

Algorithm 1: 

1. Choose an initial guess x and compute r = b — Ax 

2. Until convergence, Do 

3. Select J = {ii,i2, • • • ,«m} ^ {1, 2, . . . ,n} 

and E := [a.^ei^, . . . ,eij 

4. Solve {E^AE)y = E^r for y 

5. Compute x := x + Ey 

6. Compute r := r — AEy 

7. EndDo 

Step 6 of this algorithm can be written as 

r ■=r-^yk a.,^k, (7) 

where a:^k is the k-th column of A and y = [yi,y2, ■ ■ ■ , ym\^ ■ The relation (jTj) shows 
that, for updating r, we need m sparse SAXPY operations (a SAXPY operation 
is defined as z := x + ay, where x and y are n- vectors and a is a scalar). The 
following theorem regarding the convergence rate of the Algorithm 1 can be stated. 

Theorem 1. Let A be a symmetric positive definite m,atrix. Assum,e that, at each 
projection step, the selected index set {ii, ^2, • • • , im] contains the indices of the m 
components with largest absolute value in the current residual vector r = b — Ax. 
Then 

" — lU l|2 > l^kej^'^k ^Q^, 



A \\^new\\A 



z^kej ^'^k 



and 

WdneM < (1 - ( ^— (^) )S^)f"\\dU (9) 

where dnew = A^^b — Xnew Ojf^d d = A^^b — x. Relation ^ shows that Algorithm 1 
converges for any initial guess. 

Proof. We start by observing that dnew = d — Ey, Ad = r, and 

{Adnew, dnew) = {Ad, d) ~ (?/, E'^v). 

From ([6]) and using the Courant-Fisher min-max theorem [T], [15] , we have 

{y,E^r) = {{E^AE)-^E^r,E^r) 
> 



\-^ ' II2 



> 



X^^aAE'^AE) 

||^^r||2 



and 



II l|2 

{Ad,d) = {r,A-^r)< "^"^ 



'^iriin\-^) 

From these observations the desired results immediately follow. Relation (jH]) es- 
tablishes the convergence of the method, since 'Ylkej^'^k — Sfc=i '"I ^^^ Xmin{A) < 
\min{E^AE) < Ek^jakk (see P). D 

This theorem not only shows the convergence of the algorithm but also the rate 
of the reduction in the square of the A-norm of the error (Eq. ([H])). In fact, the 
indices ij,j = 1, . . . ,m are chosen in such a way that the reduction in the square 
of the A-norm of the error is as large as possible. If ^4 is a symmetric diagonally 
scaled matrix then J^kej'^kk = ^ and in the Eqs. ([8]) and ([9]), JZk&j'^i^k may be 
replaced by m. 

Now, by using Algorithm 1, we propose an algorithm to compute a sparse 
approximate solution of an SPD linear system of equations. In this algorithm 
no dropping strategy is needed and sparsity of the solution is preserved only by 
specifying the maximum number of its nonzero entries, Ifil, in advance. In each 



iteration at most m {m <^n) entries are added to the current approximate solution. 
This algorithm can be stated as follows. 

Algorithm 2 : Sparse approximate solution to the SPD system Ax = b 

1. Set X := and r := b 

2. While II r|| > eps and nnz{x) < Ifil Do 

3. Select the indices of m components with largest absolute value in the 

current residual vector r, i.e., J = {zi, 22, • • • , ^m} ^ {1, 2, . . . , n} 

and set E := [cj,, e^^, . . . , Cj^] 

4. Solve {E^AE)y = E^r for y 

5. Compute x := x + Ey 

6. Compute r := r — AEy 

7. EndDo 



The vector x computed by Algorithm 2 has at most Ifil nonzero entries. In 
practical implementations of Algorithm 2 the number m is usually chosen to be 
too small, for example m = 1, 2 or 3. Throughout this paper we take m = 2. The 
parameter eps is used for stopping the process when the residual norm is small 
enough. As can be seen, in Algorithm 2 no dropping strategy is used and in each 
step of the algorithm, according to Theorem 1, the A-norm of the error is reduced. 



3. Approximate inverse factors of a matrix via sparse-sparse iterations 

In this section, for computing a sparse factorized approximate inverse of an SPD 
matrix, we combine Algorithm 2 of section 2 with the AIB (Approximate Inverse 
via Bordering) algorithm proposed by Saad in pLSj . We first give a brief description 
of the AIB algorithm for symmetric matrices. 

In the AIB algorithm, the sequence of matrices 

A.+l=f^T ""' ], (10) 



is made in which An = A. If the inverse factor Uk is available for A^, i.e., 




U^AkUk = Dk, 


(11) 


then the inverse factor Uk+i for A^+i will be obtained by writing 




r u^ o\r A, V, \r u, -z,\_r D, o \ 


(12) 


in which 




AkZk = Vk, 


(13) 


Sk+i = Ofc+i - z'j^Vk- 


(14) 



Relation (1141) can be exploited if the system (1131) is solved exactly. Otherwise we 
should use 

Sk+i = Oik+i - vlzk - zl{vk - AkZk) 
= ttfc+i - v^Zk - zjrk 
= ak+i- zl{vk + rk), (15) 

instead of (fT^ . where r^ = Vk — AkZk- Starting from A; = 1, this procedure suggests 
an algorithm for computing the inverse factors of A. If a sparse approximate 
solution of (1131) is computed, then an approximate factorization of A~'^ is obtained. 
To do this, we use Algorithm 2 of section 2. This scheme can be summarized as 
follows. 

Algorithm 3. AIB algorithm 

1. Set Ai = [an], Ui = [1] and 6i = an 

2. For k = 1, . . . ,n — 1 Do: (in parallel) 

3. Compute a sparse approximate solution to AkZk = Vk, 
by using Algorithm 2, and the residual Vk = Vk — AkZk- 

4. Compute 6k+i = a^+i - zl{vk + r^). 

5. Form Uk+i and Dk+i 

6. EndDo. 

7. Set U := Un and D := D^ 



This algorithm returns U and D such that U'^AU ~ D. The following theorem 
shows that 6k+i is always positive, independently of the accuracy with which the 
system flT3|) is solved. 



Theorem 2. Let A be an SPD matrix. Then, the scalar 6k+i computed in step 4 
of Algorithm 3 is positive. 

Proof. Let r^ be the residual obtained in step 3 of the AIB algorithm, i.e., 

rk = Vk- AkZk. 
Hence, we have 

Zk = A'^^{Vk -Tfc). 

By a little computation one can see that 



^k+i = ak+i - vlA^ ^Vk + rlAf^ V^ = s + r JA^ ^ 



rk 



where s = ak+i — v^A^-^Vk G M is the Schur complement of Ak+i and is a positive 
real number (see Theorem 3.9 in [T]). So, the scalar 6k+i is positive, since A is an 
SPD matrix and r'J^A^'^rk > for rk j^ 0. D 

Hence the AIB algorithm is well-defined for SPD matrices. 

4. Numerical examples 

All the numerical experiments presented in this section were computed in double 
precision using Fortran PowerStation version 4.0 on a Pentium 4 PC, with a 3.06 
GHz CPU and 1.00GB of RAM. 

For the first set of the numerical experiments, we used nine SPD matrices (BC- 
SSTK* and S*RMT3M*) from the Matrix-Market website [H] and three matrices 
(EX15, MSC04515 and KUU ) from Tim Davis's collection [TO]. These matrices 
with their generic properties are given in Table 1. For each matrix, the problem 
size n and the number of nonzero entries in the lower triangular part nnz are pro- 
vided. In last two columns, the number of iterations (iters) and time required to 
solve the linear system using the conjugate gradient method without any scaling 
are given. The time was measured with the function etimeO and given in seconds. 
The stopping criterion 

\\b — AxAln s 



Table 1: First set of test problems information. 



matrix 


n 


nnz 


time 


iters 


BCSSTKll 


1473 


17857 


- 


t 


BCSSTK13 


2003 


42943 


- 


t 


BCSSTK15 


3948 


60882 


29.07 


9219 


BCSSTK21 


3600 


15100 


15.32 


7805 


BCSSTK38 


8032 


181746 


- 


t 


S1RMT3M1 


5489 


112505 


24.13 


4953 


S2RMT3M1 


5489 


112505 


- 


t 


S3RMT3M1 


5489 


112505 


- 


t 


S3RMT3M3 


5357 


106526 


- 


t 


MSC04515 


4515 


51111 


15.12 


4728 


EX15 


6867 


52769 


6.48 


1506 


KUU 


7102 


173651 


3.84 


550 



was used and the initial guess was taken to be the zero vector. For all the examples, 
the right hand side of each system was taken such that the exact solution is a vector 
with random entries uniformly distributed in (0, 1). No significant differences were 
observed for other choices of the right hand side vector. The maximum number of 
iterations was 10000. In all the tables a dagger ( j) indicates no convergence of the 
iterative method. 

We compare the numerical results of the new preconditioner with that of the 
SAINV preconditioner. The AINV and the SAINV algorithms have been widely 
compared with other preconditioning techniques, showing that they are the most 
effective algorithms for computing a sparse incomplete factorization of the inverse 
of a matrix [21 |3l El [Q |7|. For the SAINV algorithm we used the SAINV code 
of the SPARSLAB software provided by Tuma lj with drop tolerance r = 0.1. 
This drop tolerance is very often the right one based on the numerical results 
reported in several papers. For Algorithm 2, we used the parameters eps = 0.01 
and Ifil = 10. We also used a parameter Ifil such that the number of nonzero 
entries in the incomplete U factor divided by the number of nonzero entries in 
the upper triangular part of A, p, is approximately equal to or less than that of 



^http://www.cs.cas.cz/^tuma/sparslab.html 



Table 2: Setup time to compute sparse approximate inverse factors and results for 
the split-preconditioned CG algorithm. 





Algorithm 3 


SAINV Algorithm | 


matrix 


i/« 


P 


P-Its 


P-timc 


It-time 


T-timc 


P 


P-Its 


P-timc 


It-timc 


T-timc 


BCSSTKll 


13 

10 


0.58 
0.45 


628 
650 


0.23 
0.17 


1.20 
1.19 


1.43 
1.36 


0.58 


2099 


0.11 


3.95 


4.06 


BCSSTK13 


29 
10 


0.71 
0.26 


343 
550 


0.65 
0.10 


1.42 
1.80 


2.07 
1.90 


0.72 


370 


0.63 


1.55 


2.18 


BCSSTK15 


9 
10 


0.35 
0.37 


504 
491 


0.19 
0.20 


2.67 
2.73 


2.86 
2.93 


0.32 


214 


0.22 


1.11 


1.33 


BCSSTK21 


6 
10 


0.97 
1.48 


246 
164 


0.09 
0.13 


0.73 
0.52 


0.82 
0.65 


1.05 


167 


0.06 


0.5 


0.56 


BCSSTK38 


11 
10 


0.28 
0.25 


559 
572 


0.70 
0.64 


7.77 


8.42 
8.41 


0.29 


1131 


0.81 


15.67 


16.48 


S1RMT3M1 


15 
10 


0.41 
0.29 


244 
318 


0.45 
0.32 


2.28 
2.77 


2.73 
3.09 


0.83 


257 


0.98 


2.97 


3.95 


S2RMT3M1 


15 
10 


0.41 
0.28 


526 

585 


0.58 
0.34 


4.97 
5.11 


5.55 
5.45 


1.29 


539 


1.53 


7.5 


9.03 


S3RMT3M1 


15 
10 


0.36 
0.26 


1298 
1458 


0.65 
0.33 


11.89 
12.55 


12.54 
12.88 


2.81 


5434 


6.33 


117.80 


124.13 


S3RMT3M3 


15 
10 


0.37 
0.26 


984 
1087 


0.76 
0.31 


8.58 
8.95 


9.34 
9.26 


2.00 


5047 


3.83 


84.40 


88.23 


MSC04515 


13 

10 


0.65 
0.52 


712 
809 


0.28 
0.22 


4.16 
4.47 


4.44 
4.69 


0.66 


1020 


0.20 


5.93 


6.13 


EX15 


20 
10 


1.11 
0.65 


601 
511 


0.81 
0.34 


4.88 
3.53 


5.69 
3.87 


1.83 


1325 


0.55 


12.88 


13.43 


KUU 


8 
10 


0.19 
0.23 


141 
131 


0.38 
0.42 


1.70 
1.67 


2.08 
2.09 


0.18 


144 


0.25 


1.73 


1.98 



the SAINV preconditioner. The results of the split-preconditioned CG algorithm 
[15] in conjunction with the SAINV preconditioner and Algorithm 3 are given in 
Table 2. This table reports the density (p), the number of split-preconditioned 
CG iterations for convergence (P-Its), the setup time for the preconditioner (P- 
time), the time for the split-preconditioned iterations (It-time), and T-time which 
is equal to the sum of P-time and It-time. Numerical results presented in this 
table show that both algorithms are robust and the new method is better than 
the SAINV algorithm for 9 out of 12 problems, especially on the shell problems 
(S3RMT3M1 and S3RMT3M3). The resuhs of this table also indicate that the 
parameters eps = 0.01 and Ifil = 10 give good results. 

In Table 3, the numerical results for matrices BCSSTK13 and BCSSTK27 with 
different values of Ifil are given. This table shows the effect of an increase in Ifil 
on the reduction of the number of the iterations for convergence. The results of 
this table also indicate that the choices eps = 0.01 and Ifil = 10 lead to good 
results. 

In [3j, the numerical results of the SAINV preconditioner in conjunction with 
some preliminary transformations operated on the coefficient matrix such as sym- 
metric diagonal scaling, reordering with the multiple minimum degree (MMD) al- 
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Table 3: Results for matrices BCSSTK13 and BCSSTK21 with different values of 
Ifil 





BCSSTK13 


BCSSTK21 


Ifll 


P-time 


It-time 


P-Its 


P-time 


It-time 


P-Its 


2 


0.03 


3.02 


1039 


0.06 


0.81 


316 


4 


0.05 


2.77 


917 


0.08 


0.75 


270 


6 


0.06 


2.48 


793 


0.09 


0.72 


246 


8 


0.08 


2.19 


685 


0.10 


0.61 


198 


10 


0.11 


1.81 


550 


0.11 


0.53 


164 


12 


0.14 


1.69 


514 


0.14 


0.50 


148 


14 


0.17 


1.84 


529 


0.17 


0.50 


142 


16 


0.22 


1.78 


502 


0.20 


0.47 


125 



gorithm [13] and diagonally compensated reduction of positive off-diagonal entries 
(DCR), were given. The authors have concluded that the SAINV preconditioner in 
conjunction with symmetric diagonal scaling and reordering with MMD (J-MMD- 
SAINV) is often the best choice between the variants of the preconditioners used 
in [3j. In continuation, we use 14 out of 16 matrices used in [3] for the numerical 
experiments. Most of these matrices can be extracted from the Matrix Market web- 
site. The exceptions are NASA2910 and NASA4704 which can be downloaded from 
Tim Davis's collection, and the SMT matrix, which was provided by R. Kouhia of 
the Helsinki University of Technology q In Table 4, we give the numerical results 
of the new preconditioner in conjunction with symmetric diagonal scaling (J-N-M) 
and the J-MMD-SAINV preconditioner. Results of the J-MMD-SAINV precondi- 
tioner and the number of iterations of the conjugate gradient algorithm with Jacobi 
preconditioning (JCG-Its) were extracted from Table 7 and Table 1 in |3], respec- 
tively. It is necessary to mention that the parameter Ifil was chosen such that the 
parameter p of the new preconditioner is less than or approximately equal to that 
of the SAINV preconditioner. All assumptions and notations are as before. 

Table 4 shows that the new preconditioner is better than the J-MMD-SAINV 
preconditioner for 9 out of 14 matrices presented in this table. 

The last section of numerical experiments is devoted to some large matrices 



http://users.tkk.fi/~kouhia/sparsc.html 
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Table 4: Numerical results of the new method in conjunction with symmetric 
diagonal scaling and the J-MMD-SAINV preconditioner. 







.T~N-M 


.I-MMD-SAINV 




matrix 


n 


nnz 


If^l 


P 


P^Its 


P'timc 


It-time 


T-timc 


P 


P-Its 


,7CG-Its 


BCSSTK13 


2003 


42943 


17 


0.38 


275 


0.17 


0.94 


1.11 


0.39 


349 


1406 


BCSSTK14 


1806 


32630 


9 


0.28 


83 


0.06 


0.23 


0.29 


0.27 


73 


409 


BCSSTK15 


3948 


60882 


11 


0.32 


176 


0.17 


0.95 


1.12 


0.33 


167 


518 


BCSSTK16 


4884 


147631 


6 


0.12 


95 


0.20 


0.89 


1.09 


0.12 


98 


191 


BCSSTK17 


10974 


219812 


16 


0.40 


653 


1.19 


12.03 


13.22 


0.40 


711 


2522 


BCSSTK18 


11948 


80519 


8 


0.57 


515 


0.78 


5.73 


6.51 


0.58 


261 


1120 


BCSSTK21 


3600 


15100 


10 


1.46 


179 


0.11 


0.58 


0.69 


1.51 


191 


559 


BCSSTK25 


15439 


133840 


10 


0.59 


1614 


1.45 


26.86 


28.31 


0.57 


1512 


t 


S1RMQ4M1 


5489 


143300 


8 


0.19 


247 


0.27 


2.41 


2.68 


0.20 


248 


692 


S2RMQ4M1 


5489 


143300 


10 


0.23 


403 


0.33 


3.98 


4.31 


0.25 


528 


1529 


S3RMQ4M1 


5489 


143300 


13 


0.24 


569 


0.36 


5.66 


6.02 


0.24 


1140 


6884 


NASA2910 


2910 


88603 


20 


0.32 


262 


0.41 


1.64 


2.05 


0.95 


341 


1350 


NASA4704 


4704 


54730 


20 


0.85 


567 


0.47 


3.77 


4.24 


0.91 


1176 


4866 


SMT 


25710 


1889447 


15 


0.11 


734 


6.28 


74.83 


81.11 


0.11 


546 


1984 



Table 5: Numerical results of the new method in conjunction with symmetric 
diagonal scaling. 







New method with symmetrie diagonal scaling 


.7CG 1 


matrix 


n 


nnz 


lf^l 


P 


P-Its 


P-timc 


It-time 


T-time 


Its 


Time 


GRIDGENA 


48962 


280523 


10 
15 
20 


1.07 
1.52 
2.00 


609 
540 
464 


11.36 
11.98 
13.92 


29.85 
29.34 
27.89 


41.21 
41.32 
41.81 


1720 


48.61 


CVXBQPl 


50000 


199984 


10 
15 
20 


1.14 
1.50 
1.79 


1053 
886 
750 


11.38 
11.80 
12.28 


45.33 
41.00 
36.84 


56.71 
52.80 
49.12 


3330 


90.95 


APACHEl 


80800 


311492 


10 
15 
20 


1.43 

1.80 
2.22 


325 
245 
218 


29.44 
30.36 
31.44 


24.20 
19.48 
18.52 


53.64 
49.84 
49.96 


1796 


79.56 


CF2D2 


123440 


1605669 


10 
15 
20 


0.45 
0.65 
0.85 


1002 
850 
717 


68.63 

71.17 
73.88 


200.16 
179.8 
164.23 


268.79 
250.97 
238.11 


3870 


368.91 



extracted from Tim Davis's collection. Numerical results with different values of 
Ifil and with eps = 0.01 are given in Table 5. As can be seen, the new precon- 
ditioner in conjunction with symmetric diagonal scaling furnishes good results for 
large matrices. 

We end this section by giving the numerical results for the matrix APACHE2 ex- 
tracted from Tim Davis's collection. This is a large matrix of dimension n = 715176 
with nnz = 2776523 nonzero entries in the lower part. The conjugate gradient 
method in conjunction with symmetric diagonal scaling converges in 2482 itera- 
tions. The conjugate gradient method in conjunction with the new preconditioner 
and symmetric diagonal scaling with Ifil = 5 (p = 0.98) and Ifil = 10 (p = 1.52) 
converges in 839 and 575 iterations, respectively. Convergence history of these 
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Figure 1: Convergence history of the new preconditioner for the matrix APACHE2. 

methods is displayed in Figure 1. 

Numerical results for the matrix APACHE2 and matrices in Table 5 (and pre- 
vious tables) show that the new preconditioner reduces the number of iterations by 
about a factor of three. This is true for the J-MMD-SAINV preconditioner based 
upon a conclusion reported in ( p], page 1328). 

5. Conclusion and future work 

We have proposed an approach for computing a sparse incomplete factoriza- 
tion of the inverse of an SPD matrix. The resulting factorized sparse approximate 
inverse was used as a preconditioner for solving symmetric positive definite linear 
systems of equations by using the conjugate gradient algorithm. The new pre- 
conditioner does not need to specify the sparsity pattern of the inverse factor in 
advance. For preserving sparsity it is enough to specify two parameters eps and 
Ifil. Numerical results show that eps = 0.01 and Ifil = 10 usually give good 
results. Our numerical results also show that the proposed method in conjunction 
with the symmetric diagonal scaling is somewhat better than the J-MMD-SAINV 
preconditioner. 

The new preconditioner is suitable for parallel computers. It can also be imple- 
mented for normal equations with a little revision. 
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Future work may focus on extending the proposed preconditioner to general ma- 
trices and studying the effect of different reordering techniques on the convergence 
rate. 
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